NMR-based metabolite screening platform

ABSTRACT

Methods that enable one to specifically measure the metabolic product of a particular molecule in relatively few cells, e.g., primary cells, are described. The methods involve optionally preloading cells with labeled substrate (e.g., labeled with ¹³C, ¹⁵N, or ³¹P). The methods allow for easy identification of metabolites that are differentially generated in cells of different phenotypes. The new methods for unbiased multi-dimensional NMR screening and rapid and efficient analysis of the NMR screening identify differentially expressed metabolites in different cell or tissue types. Analysis of the differentially expressed metabolites can present unique druggable targets to which small molecule therapeutics can be designed.

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a continuation of and claims priority to U.S. patent application Ser. No. 14/377,257, filed on Aug. 7, 2014, which claims priority to International Patent Application Serial No. PCT/US2013/025628, filed on Feb. 11, 2013, which claims benefit of prior U.S. Provisional Application Ser. No. 61/597,298, filed on Feb. 10, 2012. The above applications are incorporated herein by reference in their entirety.

STATEMENT AS TO FEDERALLY SPONSORED RESEARCH

The inventions were made with Government support under R21 AI087431 awarded by the National Institutes of Health. The government has certain rights in the inventions.

FIELD OF THE INVENTION

The invention relates to NMR-based screening platforms.

BACKGROUND OF THE INVENTION

The metabolic output of a cell is the summation of the functional genomic, transcriptomic and proteomic networks that define that cell type. Metabolomics is the comprehensive and simultaneous systematic determination of metabolite levels in the metabolome and their changes over time as a consequence of stimuli. While other fields may provide information regarding, for example, the copy number of a given gene, mRNA or protein, this study of chemical processes involving metabolites provides the downstream summation of all aberrant genes, RNAs, and/or proteins. This ‘metabolic fingerprint’ represents a snapshot of all the functioning or non-functioning pathways in a particular cell type.

Several analytical methods including mass spectrometry, chromatography, and NMR spectroscopy have been used to quantify cellular metabolites. Mass spectrometry and chromatography both require small sample amounts and can be easily adapted for high throughput analysis; however, both methods typically involve at least one if not several purification steps. Furthermore, in most cases the metabolites to be examined must be selected a priori. Untargeted mass spectrometry approaches are possible but require several rounds of purification and further identification methods. In addition, some metabolites, including nucleotide analogs and lipids, are not easily ionizable and thus cannot readily be detected via mass spectrometry. Further, the fragmentation pattern resulting from mass spectrometry is not always suitable to distinguish between molecules such as sugars that have equal mass but different structures, hence limiting the analysis.

SUMMARY OF THE INVENTION

Described herein is a rapid, unbiased, ultra-high resolution, quantitative NMR screening platform that utilizes any one, two, three, four, or all five, in any combination, of the following techniques: stable isotope labeling of a substrate, spectral width folding, random phase sampling, non-uniform sampling, and data extension for enhanced dynamic range data reconstruction, to generate custom “NMR Metabolite Arrays” in which the resonances of all known metabolites of a given cell sample are categorized and used for comparison for simplified statistical analysis. This is the first time all these techniques have been combined to provide a robust, efficient and high throughput NMR metabolite screening protocol. The combinations of steps allow for global, unbiased, ultra-high resolution of both water-soluble and lipid-based metabolites. In addition, the novel “NMR Metabolite Array” programs described herein provide a new way to analyze large complex NMR datasets in a simplified manner. The new platform permits the rapid identification of differentially expressed metabolites, the quantification of specific metabolites, and the analysis of the metabolic flux of given precursors.

The new methods enable one to specifically follow the metabolic breakdown of a particular molecule in relatively few cells, e.g., about 2-20 million cells. The methods involve preloading cells with a labeled precursor substrate (e.g., labeled with ¹³C, ¹⁵N, or ³¹P) and using multidimensional NMR. The methods do not require purification of the individual metabolites of interest prior to analysis, allowing for global, unbiased identification of metabolites that are differentially generated in cells with different properties.

Identification of metabolites differentially expressed in normal and disease state cells can be a powerful tool in the clinic. The new methods for monitoring differential expression of metabolites from cells that are phenotypically different are particularly useful for identifying therapeutic targets that can be used to modulate the phenotype. This includes targets that are present in the biosynthetic pathway of the metabolite, or the metabolite itself. Further, differentially expressed metabolites can be used to differentiate cells of a normal state from those of a disease state. Hence, they have the potential to serve as biomarkers for the phenotype with which they are associated, making the methods described herein useful for identifying diagnostic markers, e.g., markers for diagnosis of disease.

For example, as described herein, we identified N-acetylneuraminic acid (NANA) as a novel biomarker for breast cancer tumor initiating cells, and monitoring its expression could be useful in diagnosing and detecting breast cancer. In addition, the protein level of CMAS, an enzyme in NANA biosynthesis, was shown to be dramatically over-expressed in breast tumor initiating cells. Before this work, the role of CMAS in tumor initiation and metastasis had not been explored. Herein we provide evidence that CMAS expression is absolutely crucial for tumor formation and migration and that CMAS is a novel bona fide target for breast cancer.

The application of this methodology to an individual patient's cell analysis will also provide the basis for a “personalized medicine” approach to patient care.

Accordingly, this disclosure describes methods for monitoring the metabolism of a given substrate precursor within a cell population, e.g., a primary cell population, a tissue cell population, or cultured cells (e.g., immortalized cells). Using the methods described herein, the identification of differentially expressed metabolites between two or more cell populations that have different phenotypes is described. Also described are methods for identifying potential therapeutic targets and diagnostic markers.

In general, in a first aspect, the disclosure features new methods of monitoring metabolism of a substrate within a given type of cell in a sample. The new methods include (a) culturing a given type of cell of a first sample with a substrate for a sufficient period of time to allow metabolic breakdown of the substrate into substrate metabolites, wherein at least a portion of the substrate is optionally labeled with a nuclear magnetic resonance (NMR) stable isotope; (b) harvesting the substrate metabolites from the cells of step (a) to obtain a second sample of substrate metabolites; and (c) performing multi-dimensional NMR on the second sample of step (b) to determine a resonance spectrum of the metabolized substrate, wherein the resonance spectrum represents the metabolites of the substrate, and wherein the multi-dimensional NMR comprises any one of the following techniques: spectral width folding, random phase sampling, non-uniform sampling, and data extension for enhanced dynamic range data reconstruction.

In another aspect, the disclosure features methods for identifying differentially expressed substrate metabolites between a first population of cells and a second population of cells. These methods include (a) optionally loading a first and a second population of cells with a nuclear magnetic resonance (NMR) stable isotope-labeled substrate; (b) culturing the first and the second population of cells of step (a) for a sufficient period of time to allow metabolic breakdown of the substrate into substrate metabolites; (c) harvesting the substrate metabolites from the first and the second population of cells of step (b) to obtain a sample of substrate metabolites from each of the first and the second cell populations; (d) performing multi-dimensional NMR on the sample of step (c) for each of the first and the second cell populations to determine a resonance spectrum of the metabolized substrate of the first population of cells and of the second population of cells, wherein the resonance spectrum represents the metabolites of the substrate; and (e) comparing the resonance spectrum of the first population of cells with the resonance spectrum of the second population of cells to determine which resonances are differentially expressed, wherein the differentially expressed resonances provide a resonance signature that represents differentially expressed metabolites.
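The comparison in step (e) can be illustrated with a minimal sketch. It is not the custom analysis program of the disclosure; it assumes each spectrum has already been reduced to a peak list of (¹H ppm, ¹³C ppm, intensity) tuples, and the function names, matching tolerances, fold-change threshold, and peak values are all hypothetical choices made for illustration.

```python
from typing import List, Optional, Tuple

# One picked resonance: (1H shift in ppm, 13C shift in ppm, intensity).
Peak = Tuple[float, float, float]


def find_match(peak: Peak, others: List[Peak],
               tol_h: float, tol_c: float) -> Optional[Peak]:
    """Return the first peak in `others` within both shift tolerances."""
    for other in others:
        if abs(peak[0] - other[0]) <= tol_h and abs(peak[1] - other[1]) <= tol_c:
            return other
    return None


def differential_resonances(first: List[Peak], second: List[Peak],
                            tol_h: float = 0.02, tol_c: float = 0.2,
                            min_fold: float = 2.0) -> List[Tuple[float, float, str]]:
    """Build a resonance signature: peaks unique to, or enriched in, one sample.

    Tolerances and the fold threshold are illustrative assumptions.
    """
    signature = []
    for p in first:
        m = find_match(p, second, tol_h, tol_c)
        if m is None:
            signature.append((p[0], p[1], "first only"))
        elif p[2] >= min_fold * m[2]:
            signature.append((p[0], p[1], "higher in first"))
        elif m[2] >= min_fold * p[2]:
            signature.append((p[0], p[1], "higher in second"))
    for q in second:
        if find_match(q, first, tol_h, tol_c) is None:
            signature.append((q[0], q[1], "second only"))
    return signature


# Hypothetical peak lists, not measured data.
first_population = [(3.20, 55.0, 10.0), (2.05, 24.5, 8.0), (4.10, 70.0, 3.0)]
second_population = [(3.21, 55.1, 9.0), (2.04, 24.6, 2.0), (1.30, 30.0, 5.0)]
signature = differential_resonances(first_population, second_population)
```

In this sketch, resonances that match within tolerance and have similar intensity drop out, so only the differentially expressed resonances remain in the signature.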

In any of these methods, the multi-dimensional NMR can include any two, three, or all four, in any combination, of the following techniques: spectral width folding, random phase sampling, non-uniform sampling, and data extension for enhanced dynamic range data reconstruction. In any of these methods, the substrate can be labeled with an NMR stable isotope and the multi-dimensional NMR can include any two, three, or all four, in any combination, of the following techniques: spectral width folding, random phase sampling, non-uniform sampling, and data extension for enhanced dynamic range data reconstruction.

In some implementations of these methods, the substrate metabolites that are present in the sample are not purified away from the other molecules in the sample. In some implementations, the substrate concentration within the population of cells is reduced for a period of time prior to loading the cells with the NMR-labeled substrate, and the resonances of the metabolites of the labeled substrate are determined using NMR pulse programs or filtering techniques, or both, customized to the substrate. The number of cells within the population of cells can be less than 2×10⁶ and the population of cells can be a primary population of cells.

Any of these methods can further include comparing the resonance signature of step (e) with a database of known resonance signatures to determine the molecular structure that the resonance signature represents, and thereby determine the substrate metabolites that are differentially expressed between the first and the second populations of cells.

In certain implementations, the methods can further include identifying a biosynthetic pathway involved in generation of the substrate metabolites and identifying proteins/enzymes of the pathway that may be targeted to modulate the differential expression of the metabolite, to thereby modulate the phenotype of the cells. In some embodiments, the first population of cells and the second population of cells are isogenic populations and/or the first population of cells and the second population of cells have different phenotypes. In some implementations, the first population of cells is a control population of cells and the second population of cells has been contacted with a test compound or agent. The methods can be used to identify metabolic pathways that are overactive or underactive in a particular cell type. The methods can further include inhibiting or overexpressing a gene in the second population of cells, in which case the methods are used to identify the metabolic consequences of over-expressing or inhibiting a gene in a cell.

In another aspect, the disclosure features methods for treating cancer in a subject. The methods are based on results determined using the new platform methods described herein. The methods of treating cancer include administering to a subject in need thereof an effective amount of an inhibitor of N-acylneuraminate cytidylyltransferase (CMAS), an inhibitor of N-acetylneuraminic acid synthase (NANS), or a molecule that decreases the expression of N-acetylneuraminic acid. For example, the inhibitor can be an enzyme or can be selected from the group consisting of a small molecule, a ribonucleic acid, a deoxyribonucleic acid, a protein, a peptide, and an antibody.

As used herein, the term “isogenic” refers to cells of the same genetic background (any cell type, e.g., epithelial, or fat, or stem, or muscle cells etc.) that are isolated from the same tissue type (any tissue, e.g., tissue of the same organ, skin, bladder, liver, heart, etc.) and from the same organism type (e.g., human, or animal, or fish).

As used herein, the term “metabolite” refers to the intermediate or the end products of metabolism. “Metabolites” have functions that include serving as an energy source, structural and signaling roles, and stimulatory and inhibitory effects on enzymes. Metabolites can also have catalytic activity themselves. A metabolite can be the end product of a substrate-enzyme reaction.

As used herein, the term “metabolic precursor” refers to a compound that participates in a chemical reaction. The term is meant to include a compound that is a starting compound or an intermediate compound of an enzymatic reaction from which an end product results.

The term “substrate” refers to a molecule or compound on which an enzyme acts and results in the substrate transforming into one or more end products. The end products are released from the active site of the enzyme.

The term “enzyme” refers to a molecule that accepts a substrate in its active site and transforms the substrate into one or more end products that are subsequently released from the active site.

As used herein, the term “primary cell” or “primary tissue” refers to cells or tissue taken directly from living tissue of a normal individual or an individual with an acquired or inherited disease and established to grow in vitro.

The term “metastasis” refers to a process by which cancer spreads from the place at which it first arose as a primary tumor to distant locations in the body, as well as to the newly established tumor itself, which is also referred to as a “metastatic tumor.” A metastatic tumor can arise from a multitude of primary tumor types, including but not limited to those of prostate, colon, lung, breast, bone, and liver origin. Metastases develop, e.g., when tumor cells shed from a primary tumor adhere to vascular endothelium, penetrate into surrounding tissues, and grow to form independent tumors at sites separate from a primary tumor.

The term “cancer” refers to cells having the capacity for autonomous growth. Examples include cells having an abnormal state or condition characterized by rapidly proliferating cell growth. The term is meant to include cancerous growths, e.g., tumors (e.g., solid tumors); oncogenic processes, metastatic tissues, and malignantly transformed cells, tissues, or organs, irrespective of histopathologic type or stage of invasiveness. Also included are malignancies of the various organ systems, such as respiratory, cardiovascular, renal, reproductive, hematological, neurological, hepatic, gastrointestinal, and endocrine systems; as well as adenocarcinomas which include malignancies such as most colon cancers, renal-cell carcinoma, prostate cancer and/or testicular tumors, non-small cell carcinoma of the lung, cancer of the small intestine, and cancer of the esophagus. Cancer that is “naturally arising” includes any cancer that is not experimentally induced by implantation of cancer cells into a subject, and includes, for example, spontaneously arising cancer, cancer caused by exposure of a patient to a carcinogen(s), cancer resulting from insertion of a transgenic oncogene or knockout of a tumor suppressor gene, and cancer caused by infections, e.g., viral infections. The term “carcinoma” is art recognized and refers to malignancies of epithelial or endocrine tissues. The term also includes carcinosarcomas, which include malignant tumors composed of carcinomatous and sarcomatous tissues. An “adenocarcinoma” refers to a carcinoma derived from glandular tissue or in which the tumor cells form recognizable glandular structures.

As used herein, the term “treating” or “treatment” refers to administering one or more of the compounds described herein to a subject who has a disorder treatable with such compounds, and/or a symptom of such a disorder, and/or a predisposition toward such a disorder, with the purpose to confer a therapeutic effect, e.g., to cure, relieve, alter, affect, ameliorate, or reduce the disorder, the symptom of it, or the predisposition.

As used herein, the term “an effective amount” or “an amount effective” refers to the amount of an active compound that is required to confer a therapeutic effect on the treated patient. Effective doses will vary, as recognized by those skilled in the art, depending on the types of diseases treated, route of administration, excipient usage, and the possibility of co-usage with other therapeutic treatment.

Dosage, toxicity, and therapeutic efficacy of therapeutic compounds can be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., for determining the LD50 (the dose lethal to 50% of the population) and the ED50 (the dose therapeutically effective in 50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic index and it can be expressed as the ratio LD50/ED50. Compounds that exhibit high therapeutic indices are preferred. While compounds that exhibit toxic side effects may be used, care should be taken to design a delivery system that targets such compounds to the site of affected tissue in order to minimize potential damage to unaffected cells and, thereby, reduce side effects.
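As a brief worked illustration of the ratio LD50/ED50 defined above, the following sketch computes a therapeutic index. The function name and the dose values are hypothetical and chosen only to show the arithmetic; they are not data from this disclosure.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the ratio LD50/ED50; both doses must be in the same units."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("doses must be positive")
    return ld50 / ed50


# Hypothetical doses in mg/kg, for illustration only.
example_ti = therapeutic_index(ld50=500.0, ed50=25.0)
print(example_ti)
```

A larger ratio means a wider margin between the therapeutically effective dose and the lethal dose.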

Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, suitable methods and materials are described below. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting.

Other features and advantages of the inventions will be apparent from the following detailed description, and from the claims.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a flow chart summarizing the key steps of the NMR-based metabolite screening platform described herein.

FIGS. 2A-F detail each step of the acquisition and processing in the NMR methods that allow for rapid, unbiased, ultra-high resolution metabolite profiling. FIGS. 2A-2D demonstrate spectral width (sw) folding strategies, in which decreasing the sw from 220 ppm to 90 ppm led to a 2.5-fold increase in resolution. FIG. 2E summarizes a non-uniform sampling (NUS) strategy allowing either an 8-fold increase in resolution or a 4-fold increase in resolution together with a 40% reduction in time. FIG. 2F shows a NUS ¹³C-¹H HSQC spectrum (left) processed using forward maximum entropy reconstruction and the same NUS ¹³C-¹H HSQC spectrum (right) after including a data-extension step in the reconstruction that increased resolution by 2-fold.
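The resolution gain from spectral width folding can be illustrated with a short sketch. It assumes that digital resolution in the indirect dimension scales as the spectral width divided by the number of increments, and that a resonance lying outside the reduced window aliases by simple wrap-around; the actual folding behavior depends on the quadrature detection scheme used. The function names, the 256 increments, and the window center of 45 ppm are hypothetical.

```python
def digital_resolution(sw_ppm: float, n_points: int) -> float:
    """Spacing between points (ppm per point) for a given spectral width."""
    return sw_ppm / n_points


def folded_position(true_ppm: float, center_ppm: float, sw_ppm: float) -> float:
    """Apparent shift of a peak after wrap-around aliasing into the window.

    Assumes simple wrap-around; other acquisition schemes reflect instead.
    """
    lower = center_ppm - sw_ppm / 2.0
    return lower + (true_ppm - lower) % sw_ppm


# Same number of increments (hypothetical: 256) at both spectral widths.
gain = digital_resolution(220.0, 256) / digital_resolution(90.0, 256)
print(round(gain, 2))

# A resonance at 170 ppm observed in a 90 ppm window centered at 45 ppm.
print(folded_position(170.0, 45.0, 90.0))
```

With these assumptions the gain is 220/90, or about 2.4, which is in line with the roughly 2.5-fold increase stated for FIGS. 2A-2D. The true shift of a folded peak can be recovered by adding back a whole number of spectral widths.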

FIGS. 3A and 3B are ¹³C-¹H HSQC spectra indicating the full metabolic coverage of water-soluble (3A) and lipid-based (3B) metabolites from the same p53 deficient mouse lung tumor, respectively.

FIG. 4 is a flow chart summarizing the custom NMR analysis program used to create NMR arrays for rapid analysis.

FIGS. 5A-C are a set of ¹³C-¹H HSQC resonance spectra of water-soluble metabolites from 20 million unlabeled (5A), ¹³C-glutamine incubated (5B), and ¹³C-glucose incubated (5C) breast tumor initiating cells.

FIG. 6 highlights the information available in the NMR arrays described herein. Using the unlabeled, glutamine, and glucose spectra in FIGS. 5A-C, a master lookup was created to generate metabolite IDs for all possible metabolite resonances, and these are listed on the X-axis. The Y-axis displays the relative intensity of each resonance in each condition, highlighting the differential glucose and glutamine derived metabolites.

FIGS. 7A-B are representative examples of ¹³C-¹H HSQC resonance spectra of water-soluble glucose derived metabolites in breast tumor initiating BPLER cells (FIG. 7A) and less malignant isogenic HMLER cells (FIG. 7B).

FIG. 7C summarizes the NMR arrays for the BPLER and HMLER ¹³C-¹H HSQC resonance spectra, showing how the intensity (Y-axis) of all possible metabolite resonances (X-axis) changes in each cell type. BPLER and HMLER cells originate from the same normal breast tissue and were grown into two cell types: BPECs (breast primary epithelial cells) grown in chemically defined WIT medium and HMECs (human mammary epithelial cells) grown in MEGM media. BPEC and HMEC cells were transformed with hTERT (L), the SV40 early region (E), and H-ras (R) to give rise to BPLER and HMLER cells.

FIG. 7D is a zoomed-in region of an overlay of BPLER (red) and HMLER (blue) ¹³C-¹H HSQC spectra, in a region that the NMR array predicted to have only BPLER resonances.

FIGS. 8A-C summarize various methods used to validate that the differentially expressed metabolite overexpressed in the BPLER NMR arrays is N-acetylneuraminic acid (NANA). FIG. 8A is the ¹³C-¹H HSQC of pure NANA. FIG. 8B illustrates the results of a custom NMR HCN (hydrogen-carbon-nitrogen) experiment, confirming BPLER cells have a differentially expressed resonance with similar connectivity to NANA. FIG. 8C shows the MS results directly measuring NANA in HMLER (left) and BPLER (right) cells using liquid chromatography-mass spectrometry multiple reaction monitoring (LC/MS MRM), where the number reported is the area under the NANA peak.

FIG. 9 is a series of representations of rhodamine-labeled wheat germ agglutinin (WGA) immunofluorescence microscopy images showing the expression of NANA in HMLER (top row) and BPLER cells (bottom row), respectively. WGA specifically binds to NANA.

FIG. 10 is a schematic diagram depicting the enzymatic steps to convert glucose to NANA, in which N-acetylneuraminic acid synthase (NANS) and N-acylneuraminate cytidylyltransferase (CMAS) are key enzymes.

FIG. 11 is a bar graph showing the effects of downregulating CMAS, NANS, and PLK1 via siRNA on the viability of HMLER and BPLER cells.

FIGS. 12A-C are a series of images that provide an overview of the NANA effect on cell migration of BPLER and HMLER cells. FIGS. 12A and 12B are each a series of immunohistochemistry images showing the inhibition of cell migration in the absence of NANS and CMAS and the rescue of migration with NANA in HMLER and BPLER cells, respectively. FIG. 12C is a bar graph quantifying the number of migrating cells in the absence of NANS and CMAS and the subsequent migration following addition of NANS.

FIGS. 13A-B are a bar graph depicting the quantitative PCR (mRNA levels) of CMAS and cMYC (13A) and a Western Blot analysis of NANS and CMAS expression (13B), respectively, in HMLER and BPLER cells.

FIGS. 14A and 14B are immunohistochemistry images of the effect of CMAS expression on cell migration. FIG. 14A is a series of immunohistochemistry images of HMLER cell migration following overexpression of CMAS. FIG. 14B is a series of immunohistochemistry images of BPLER cells and of BPLER cells with stable CMAS knockdown (BPLER-shCMAS1).

FIGS. 15A-B summarize the strategy to determine the effect of CMAS levels on tumor initiation and metastasis in vivo. FIG. 15A shows the immunization scheme, and FIG. 15B shows the tumor volume growth per day in NOD/SCID mice after injection with 500,000 BPLER (left) or 500,000 BPLER-shCMAS1 cells.

FIGS. 16A, 16B, 16C, and 16D show the enzyme mechanism of CMAS, a synthesized substrate-based inhibitor of CMAS based on the structure of NANA, and immunohistochemistry analysis showing the inhibition of cell migration with the synthesized fluorine-NANA inhibitor, respectively.

FIGS. 17A and 17C are the chemical structures of (2R,3R,4S)-4-guanidino-3-(prop-1-en-2-ylamino)-2-((1R,2R)-1,2,3-trihydroxypropyl)-3,4-dihydro-2H-pyran-6-carboxylic acid (which is zanamivir, marketed as Relenza®) (FIG. 17A) and ethyl (3R,4R,5S)-5-amino-4-acetamido-3-(pentan-3-yloxy)-cyclohex-1-ene-1-carboxylate (which is oseltamivir, marketed as Tamiflu®) (FIG. 17C).

FIGS. 17B and 17D illustrate the immunohistochemistry analyses of the effects of Relenza® on BPLER cell migration (17D) and a control (17B). Both zanamivir and oseltamivir are neuraminidase inhibitors.

FIGS. 18A and 18B are immunofluorescence microscopy images showing the expression of NANA in HMLER and BPLER cells in the absence and presence of neuraminidase (18A) and immunohistochemistry images of HMLER and BPLER cell migration in the presence and absence of neuraminidase (18B).

DETAILED DESCRIPTION

The present disclosure describes novel methods that provide benefits over numerous aspects of known NMR data acquisition and allow for rapid, unbiased, global, quantitative ultra-high resolution NMR data acquisition and custom NMR analysis utilizing a novel approach. The present disclosure describes new NMR screening methods that can be used to identify, follow, and characterize the metabolic breakdown of a particular molecule and to analyze the cellular metabolomics essential to a given cell type. The methods described herein circumvent the hurdles presented by known NMR protocols by reducing the sample size necessary to perform the data acquisition, eliminating the need for purification of the metabolites to be analyzed, reducing the experimental time required for multi-dimensional NMR, and obtaining the high resolution necessary for metabolite identification. The methods require relatively few cells (2-20 million), allowing them to be used to study metabolites from primary cells and tissues rather than just from cell cultures. In addition, the disclosed methods do not require purification of the individual metabolites of interest from other cellular metabolites prior to analysis. Further, the new methods allow one to visualize the specific metabolic fate of a given precursor in any cell type, and thus provide for the simplified identification of metabolites that are differentially generated in different types of cells utilizing novel data analysis methods also described herein.

To highlight the power and breadth of the new platform methods, this disclosure describes the identification of differentially expressed metabolites in triple negative breast cancer tumor-initiating BPLER cells that are highly aggressive compared to a less malignant isogenic line, HMLER. It is widely known in the field that cancer cells thrive on glucose consumption. The glucose metabolite N-acetyl-neuraminic acid (NANA) has been identified herein as being differentially expressed in breast cancer tumor-initiating BPLER cells (i.e., increased expression). The biosynthetic pathway that generates the metabolite has also been identified and used to identify proteins/enzymes required for the synthesis of the metabolite as candidate targets within the pathway to modulate the phenotype of increased tumor initiation and metastatic potential.

In addition, the results of knock-down experiments in which genes that produce the key enzymes NANS and CMAS are silenced demonstrated that the reduction of the normal function of these enzymes to generate NANA and attach it to proteins had no effect on the proliferation of BPLER cells, but greatly reduced their migration. On the other hand, forced over-expression of the same enzymes in HMLER cells increased their migration. Stable knockdown of CMAS in BPLER cells completely prevented tumor formation in mice. Thus, the new methods were successfully used to identify metabolites (e.g., NANA) important for tumorigenicity and to demonstrate that NANA and the proteins involved in the synthesis of NANA may serve as targets for therapeutic intervention to reduce breast tumor formation and the metastasis of tumor-initiating cells. In addition, NANS and CMAS were validated as targets for inhibition of tumor initiation and metastasis in vitro and in vivo. Thus, small molecule and other inhibitors of these enzymes are new candidate therapeutic agents that can be used to specifically target breast tumor-initiating cells. Furthermore, the differentially expressed metabolites (e.g., NANA) can also serve as biomarkers for the phenotype with which they are associated, allowing the methods described herein to be used to identify new candidate diagnostic markers.

General Methodology

In general, the new methods described herein for monitoring metabolites of a given precursor molecule (i.e., substrate) within a given type of cell include the steps of: (a) optionally loading a population of cells with a labeled substrate, e.g., a ¹³C-labeled substrate; (b) culturing the cells of step (a) for a sufficient period of time to allow metabolic breakdown of the substrate into substrate metabolites (e.g., typically 5 minutes to 24 hours depending on the experimental question); (c) harvesting the substrate metabolites from the cells to obtain a sample of substrate metabolites, e.g., a water-soluble sample of substrate metabolites and an organic sample of lipid-based metabolites; and (d) performing multi-dimensional NMR on the sample of step (c) to determine the resonance spectra of the metabolized substrate, wherein the resonances represent the metabolites of the substrate. How the multi-dimensional NMR is performed is described in further detail below.

In these methods, various substrates and various stable spin-½ nuclear isotopes can be used to label those substrates. For example, glucose, glutamine, fatty acids, amino acids, pyruvate, drug compounds, and other molecules can be used as substrates, and stable isotopes such as ¹³C, ¹⁵N, ²⁹Si, ³¹P, or others can be used to label the substrates. Any tissue, primary cells or cultured cell lines can be used for the analysis, such as cancer cells, muscle cells, fat cells, endothelial cells, epithelial cells, neuronal cells, cardiac cells, and many others. In general, one would want to test cells associated with a particular disease or disorder, such as a cancer cell, as well as the same type of cells from a healthy subject, to provide a differential metabolic analysis. One could also test cells with a single gene mutation (i.e., mutant cells vs. wild-type) or cells treated or not with a drug, or the same cells incubated with the precursor for various times. The cell populations of each sample can be a homogeneous cell population. In alternative embodiments, the cell population of each sample can be a heterogeneous cell population (e.g., derived from a tissue sample). While any number of cells can be used in the methods described herein, in various embodiments the number of cells within each population of cells can be less than 1×10⁸, less than 8×10⁷, less than 7×10⁷, less than 6×10⁷, less than 5×10⁷, less than 2.5×10⁷, or less than 2.0×10⁷, or the cell number can range from approximately 1×10⁶ to 1×10⁸ cells, or 1×10⁶ to 5×10⁷, or 1×10⁶ to 2.5×10⁷, or 1×10⁶ to 2.0×10⁷.

Methods for loading cells are well known to those of skill in the art, e.g., the labeled substrate can be added to the cell culture medium for a period of time (e.g., 5 minutes to 24 hours), or may be loaded by transfection (e.g., liposome or calcium phosphate), transduction (e.g., viral delivery of labeled substrate) or transfusion (e.g., direct injection into a tumor). In one embodiment, labeled precursor is administered to a subject, e.g., orally, topically, parenterally, or intravenously. In some embodiments, the labeled substrate is added to animal feed.

In various embodiments, prior to adding the culture medium containing the labeled substrate, the cells are incubated with cell culture medium that lacks any form of the substrate, e.g., lacks the substrate in unlabeled or labeled form. The cells may be incubated in this medium for minutes to hours, essentially starving the cells of the substrate. This helps reduce background in the later analysis. The cells are then incubated with cell culture medium containing only labeled substrate for a period of time (e.g., 5 minutes to 24 hours). Any concentration of substrate can be used. In various embodiments, the concentration ranges from 1 ng/ml to >1 mg/ml. A skilled artisan can easily determine the best concentration to use by testing various concentration ranges. After incubation, the cells are washed briefly and immediately harvested to separate the metabolites. To harvest the metabolites, a simple chloroform extraction may be performed in order to obtain a water-soluble sample of metabolites (the aqueous layer) or a non-water-soluble sample of metabolites (the organic layer). No additional purification is required.

Several nuclear magnetic resonance (NMR) techniques may be used in the methods of the invention, preferably multidimensional NMR. For example, heteronuclear single quantum correlation (HSQC) spectroscopy, variations of HSQC, and other multidimensional NMR techniques can be used. Methods for performing multidimensional NMR (e.g., 2D NMR and/or ¹³C-¹H HSQC NMR) are well known to those of skill in the art.

The resonance spectra of the metabolites of the labeled substrate can be determined using NMR pulse programs, which can be customized to the substrate. In general, NMR uses a static, homogeneous external magnetic field to polarize the NMR sample. This primary field is typically called the “B0” field, and it defines a reference axis for the NMR system. The NMR sample is magnetized in the direction of the B0 field by placing the sample in the B0 field for a period of time (e.g., minutes) and allowing the sample to reach a thermal equilibrium state. The primary B0 field also typically defines the resonance frequencies of the spin-½ nuclear species in the sample. For example, a stronger primary field generally increases the nuclear spins' resonance frequencies. The nuclear spins “precess” about the B0 field at their respective resonance frequencies. In most NMR systems, the nuclear spins have a resonance frequency in the radio frequency (RF) range.

In NMR experiments, the nuclear spins in the NMR sample are manipulated by applying a time-varying magnetic field at the nuclear spins' resonance frequency. In some instances (e.g., for low flip-angle pulses), high-intensity radio frequency (RF) pulses provide fast, precise control of the nuclear spins. High-intensity RF pulses have the benefit of shorter pulse times, which reduces the amount of decoherence that occurs during the pulse. In some instances (e.g., for high flip-angle pulses), high-intensity RF pulses provide less precision, for example, due to non-uniform power output over a frequency range of interest, due to spatial inhomogeneity in the RF field, or due to other considerations.

In some implementations, adiabatic pulses can provide more precise control of the nuclear spins. Adiabatic pulses typically have a lower intensity and require a longer pulse time. In some cases, adiabatic pulses are used for larger flip-angle pulses (e.g., 180 degree flip angle) to provide a more uniform flip angle over the entire frequency range of interest. Adiabatic pulses are typically implemented as shaped pulses (meaning that they have a time-varying power profile) that can be parameterized for a particular flip angle, a particular frequency range, etc.

In various embodiments, the new NMR methods include the use of any one or more of the following methods and techniques to increase analysis speed, resolution, or both: (i) stable isotope labeling, (ii) folding the spectral width and aliasing peaks, (iii) random phase sampling, (iv) non-uniform sampling, and (v) data extension. In some embodiments, the methods include at least two or three of these techniques in any combination. In some embodiments, the methods include all five techniques. Table 1 below summarizes the effects of each step on the NMR acquisition and data resolution. These steps are described in more detail below.

TABLE 1

METHOD | EFFECT ON NMR ACQUISITION & DATA RESOLUTION
Stable isotope labeling | Increases metabolite signal detection by 99%
Folding spectral width (sw) | Increases resolution by 2.5-fold to 44 Hz (approaching theoretical limit of C-C decoupling)
Random phase sampling | Decreases the total acquisition time by 50%
Non-uniform sampling | For time-equivalent spectra one can gain an 8-fold increase in resolution, or one can gain a 4-fold increase in resolution and at the same time cut experimental time by 40%
Data extension | 2-fold increase in data resolution with no effect on experimental time

Stable Isotope Labeling

Traditional 2-D NMR metabolite profiling relies on the natural abundance of ¹³C, which is 1.1%. As such, in order to observe any signal at all, large amounts of sample are required (on average, 200 million cells or more). This limits detection to cultured cell lines, and only the most abundant metabolites are detected. To decrease the amount of material needed and view metabolites with a broader concentration range, samples (e.g., cells, tissues or tumors) were directly supplemented with ¹³C-labeled precursors (glucose, glutamine, pyruvate and amino acids were used, but other substrates and other isotopes are also possible). Theoretically, this should decrease the sample burden by ~99%. If, for example, a 1 mM metabolite required 200 million cells to be detected with ¹³C natural abundance, using a label to see the same intensity would require only 2 million cells. Including this step in our method reduces sample requirements to levels similar to those of some mass spectrometry (M/S) approaches. Untargeted M/S requires at least 1 million cells and is followed by several rounds of chromatography purification and detection. Our method requires few cells and no purification.
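The arithmetic behind the ~99% figure can be sketched as follows. This is only an illustration: it assumes that the detected ¹³C signal scales linearly with the fraction of carbon atoms that are ¹³C, and the function name and the `enrichment` parameter are hypothetical.

```python
# Hedged sketch: assumes the detected 13C signal scales linearly with the
# fraction of carbon atoms that are 13C (an idealization).
NATURAL_ABUNDANCE_13C = 0.011  # 1.1%, as stated in the text


def cells_needed_with_label(cells_at_natural_abundance, enrichment=1.0):
    """Cells needed for equal signal when the substrate is 13C-enriched.

    `enrichment` is the assumed 13C fraction of the labeled substrate
    (1.0 means fully labeled; a hypothetical parameter for illustration).
    """
    return cells_at_natural_abundance * NATURAL_ABUNDANCE_13C / enrichment


cells = cells_needed_with_label(200e6)
reduction = 1.0 - cells / 200e6
print(f"cells needed: {cells:,.0f}; sample burden reduced by {reduction:.1%}")
```

Under these assumptions the requirement drops from 200 million to roughly 2 million cells, in line with the estimate given above.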

Folding the Spectra Width and Aliasing Peaks

When a molecule is placed in a strong magnetic field (B0), the electrons in that molecule circulate about the direction of the applied magnetic field. This circulation creates a small magnetic field at each atomic nucleus that opposes the applied field. The magnetic field at the nucleus (B) is therefore generally less than the external magnetic field (B0) by a fraction τ.

B=B0(1−τ).

The electron density around each nucleus within a molecule varies according to the types of nuclei and the bonds in the molecule. The opposing field, and therefore the effective magnetic field at each nucleus, will vary. In pulsed NMR spectroscopy these differences can be measured by applying a radio frequency pulse that causes the nuclear magnetization to oscillate, inducing an electrical current in a coil that can be measured. This signal, known as “free induction decay” (FID), is plotted as current with respect to time. By applying a discrete Fourier transform, the FID can be converted to the frequency domain, and the resonance frequency of each observable nucleus can be converted to chemical shift (δ) by the equation,

δ = (n − n_REF) × 10⁶ / n_REF

where n is the resonance frequency of the nucleus and n_REF is the resonance frequency of a standard.
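As a worked illustration of the chemical shift equation, the following sketch converts a resonance frequency to parts per million. The reference frequency of 125 MHz is a hypothetical value chosen only to make the arithmetic easy to follow.

```python
def chemical_shift_ppm(freq_hz, ref_freq_hz):
    """Chemical shift in ppm: (n - n_REF) * 10^6 / n_REF."""
    return (freq_hz - ref_freq_hz) * 1e6 / ref_freq_hz


# Hypothetical numbers: a reference resonating at 125 MHz and a nucleus
# resonating 2.5 kHz above it.
ref_freq = 125_000_000.0
nucleus_freq = ref_freq + 2_500.0
shift = chemical_shift_ppm(nucleus_freq, ref_freq)
print(f"chemical shift = {shift:.2f} ppm")
```

With these assumed values the offset of 2.5 kHz corresponds to a shift of 20 ppm.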

Chemical shift is a very precise metric of the chemical environment around a nucleus. Unlike M/S and chromatography, NMR is one of the few methods that can distinguish molecules that have the same mass but different chemical connectivity. However, to utilize this information, ultra-high resolution spectroscopy is needed.

In NMR, digital resolution is determined by the sweep width (sw) and the total number of data points (TD), such that

Resolution=sw/TD, measured in Hz/point.

The sw is the range of frequencies over which NMR signals are to be detected. Metabolite mixtures contain diverse molecules, and the spectral width necessary to cover all potential carbon chemical shifts spans over ~220 ppm. The large ¹³C chemical shift window creates a dilemma: in order to have maximum resolution and a broad enough sw to encompass all possible chemical shifts, one would require an incredibly large number of data points. In practical terms this is time-prohibitive.
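The dilemma can be made concrete by rearranging Resolution = sw/TD to give the number of points needed for a target resolution. The sketch below is illustrative only: the conversion of 125 Hz per ppm (a ¹³C frequency of 125 MHz) and the 1 Hz/point target are assumptions, not values from this document.

```python
import math


def points_needed(sw_hz, resolution_hz_per_point):
    """Number of data points TD such that sw / TD meets the target resolution."""
    return math.ceil(sw_hz / resolution_hz_per_point)


# Assumption for illustration: 13C resonates at 125 MHz, so 1 ppm = 125 Hz.
HZ_PER_PPM = 125.0

full_window_points = points_needed(220 * HZ_PER_PPM, 1.0)
narrow_window_points = points_needed(90 * HZ_PER_PPM, 1.0)
print(full_window_points, narrow_window_points)
```

Because each point in the indirect dimension is a separate experiment, the point count translates directly into measurement time, which is why a narrower window is attractive.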

To circumvent this, our method can include folding the sw. The sw is purposefully set to a smaller range, and if some peaks occur outside this range they will appear “folded” at aliased chemical shifts. Folded spectra can be unfolded by suitable data processing techniques. For example, resonance frequencies can be de-aliased by expanding the frequency spectrum and shifting the aliased frequencies by a pre-defined amount to their actual locations. In some cases, the data acquisition parameters define the spectral folding windows in a manner that reduces or minimizes any overlap between folded spectral peaks and non-folded spectral peaks. As such, the folded spectral peaks can be de-aliased without affecting other data in the frequency spectrum, in some cases. Folding spectra decreases the overall number of points required in order to achieve the maximum resolution possible. Our custom folding strategy and de-aliasing program allow ultra-high resolution spectra with ~44 Hz/point separation.
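A minimal sketch of folding and de-aliasing follows. It is not the custom de-aliasing program referred to above; it assumes simple wrap-around aliasing (a peak outside the window reappears shifted by an integer multiple of sw), and the window limits, the example shift, and the function names are hypothetical.

```python
def fold_shift(true_ppm, window_low_ppm, sw_ppm):
    """Position at which a peak appears in a window of width sw_ppm.

    Assumes simple wrap-around aliasing; some acquisition modes mirror
    peaks instead, which this sketch does not model.
    """
    return window_low_ppm + (true_ppm - window_low_ppm) % sw_ppm


def unfold_shift(observed_ppm, sw_ppm, n_folds):
    """Recover the true shift when the number of folds is known."""
    return observed_ppm + n_folds * sw_ppm


# Hypothetical example: a 90 ppm window starting at 10 ppm and a peak whose
# true shift (175 ppm) lies outside that window.
observed = fold_shift(175.0, window_low_ppm=10.0, sw_ppm=90.0)
recovered = unfold_shift(observed, sw_ppm=90.0, n_folds=1)
print(observed, recovered)
```

In practice the number of folds has to be inferred, for example from the proton shift of the same crosspeak, which is why minimizing overlap between folded and non-folded peaks matters.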

Random Phase Sampling

As described above, converting the FID signal to frequency data requires Fourier transformation. However, for a magnetization vector that starts along +x and rotates about the z-axis, the Fourier transform of a signal detected at a single receiver phase gives peaks at both +ν and −ν, because the Fourier transformation cannot distinguish between a +ν and a −ν rotation of the vector. The most common method to distinguish the sign of the frequency requires sampling the signal at two different receiver phases (for example, 0° and 90°). For multidimensional NMR this increases the experiment time by a factor of two for each dimension. To increase the speed of our analysis, we employed random phase sampling (RPS), in which a single phase is used to detect each point but the phase is randomly alternated between points in the signal. This allows us to resolve the sign of the frequency while cutting the acquisition time in half.
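The sign ambiguity that motivates two-phase detection can be illustrated with a small discrete Fourier transform. This sketch only demonstrates the ambiguity itself; it does not implement random phase sampling or its reconstruction, and the signal length and frequency bin are arbitrary choices.

```python
import cmath
import math


def dft_magnitudes(signal):
    """Magnitude of the discrete Fourier transform of a short signal."""
    n = len(signal)
    return [
        abs(sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]


def peak_bins(signal):
    """Frequency bins whose magnitude exceeds a quarter of the signal length."""
    n = len(signal)
    return [k for k, m in enumerate(dft_magnitudes(signal)) if m > 0.25 * n]


N = 32  # number of time points (arbitrary)
K = 5   # true frequency, in units of DFT bins (arbitrary)

# One receiver phase: only the x-component (cosine) of the rotation is seen.
single_phase = [math.cos(2 * math.pi * K * t / N) for t in range(N)]
# Two receiver phases (0 and 90 degrees): cosine and sine form a complex signal.
two_phase = [cmath.exp(2j * math.pi * K * t / N) for t in range(N)]

single_phase_peaks = peak_bins(single_phase)
two_phase_peaks = peak_bins(two_phase)
print(single_phase_peaks, two_phase_peaks)
```

The single-phase signal produces peaks at bin K and at bin N − K (the negative frequency), whereas the two-phase signal produces a peak only at bin K.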

Non-Uniform Sampling and Data Extension

Two-dimensional NMR techniques generate two dimensions of data in the time domain: a direct domain and an indirect domain. The direct domain data are generated by running an experiment and collecting an NMR signal (e.g., an FID, an echo, or a stroboscopic signal). In other words, the direct domain is the time domain of an NMR experiment. The indirect domain data are generated by systematically varying a time parameter of the NMR experiment (e.g., incrementing a delay time), running the NMR experiment for each value of the parameter, and combining the NMR signals from all experiments. In other words, the indirect domain is the time domain of the parameter that is systematically varied.

In some cases, non-uniform sampling can be used in multi-dimensional NMR for the methods described herein. For example, non-uniform sampling can be used in the indirect domain to reduce the number of NMR experiments that are needed to obtain a particular spectral range and frequency resolution.

Non-uniform sampling (NUS) can be accomplished by incrementing the indirect domain time parameter systematically and in a non-uniform manner. In particular, instead of incrementing the time parameter by the same amount for each successive NMR experiment, the time parameter can be incremented by an amount that varies depending on one or more factors. For example, the time delay parameter can be incremented by an amount that changes (e.g., increases or decreases) from experiment to experiment. Varying the time delay according to a Poissonian distribution or another nonlinear distribution results in sparsely sampled indirect domain data. The missing points in the “sparse” data set can be calculated using reconstruction methods. The forward maximum entropy reconstruction technique can conserve the measured time-domain data points and guess the missing data points by an iterative process. The iterative process can include discrete Fourier transformation of the sparse time-domain data set, computation of the spectral entropy, determination of a multidimensional entropy gradient, and calculation of new values for the missing time-domain data points with a conjugate gradient approach. Since this procedure does not alter measured data points, it can reproduce signal intensities with high fidelity and avoid dynamic range problems. In some cases, our results indicate that, with appropriate sampling schedules, NUS has an enhanced ability to detect weak peaks. This is extremely important for metabolite analysis, where there is a large dynamic range between abundant metabolites (millimolar concentration) and rare metabolites (nanomolar concentration).
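One way to picture a non-uniform schedule is to draw the gap between successive indirect-domain increments from a Poisson distribution. The sketch below is an assumption-laden illustration rather than the schedule used here: the grid size, mean gap, seed, and function names are hypothetical, and published schemes often also weight the gaps so that early time points are sampled more densely.

```python
import math
import random


def poisson_draw(mean, rng):
    """Draw one Poisson-distributed integer (Knuth's multiplication method)."""
    limit = math.exp(-mean)
    k, product = 0, 1.0
    while True:
        product *= rng.random()
        if product <= limit:
            return k
        k += 1


def nus_schedule(grid_size, mean_gap, seed=0):
    """Indices on a uniform grid that are actually measured.

    After each measured increment, a Poisson-distributed number of grid
    points is skipped before the next measurement.
    """
    rng = random.Random(seed)
    schedule = []
    index = 0
    while index < grid_size:
        schedule.append(index)
        index += 1 + poisson_draw(mean_gap, rng)
    return schedule


schedule = nus_schedule(grid_size=256, mean_gap=3.0)
print(f"measuring {len(schedule)} of 256 increments")
```

Only the listed increments would be recorded; the skipped points are the "missing" data that a reconstruction method must later supply.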

During the reconstruction, it is possible to further increase the resolution of resonances utilizing “data extension.” In this method, during the reconstruction, the total number of points in the indirect dimension is doubled. In one embodiment, the first half of the time domain (composed of NUS sampled points) is solved according to standard forward maximum entropy protocols, and the second half of the data, which does not contain any sampled points, is completely built using iterative soft thresholding. In some cases, this allows a two-fold increase in resolution without affecting acquisition time.
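To give a feel for iterative soft thresholding, the following sketch reconstructs a sparse spectrum from non-uniformly sampled data. It is a simplified stand-in, not the forward maximum entropy protocol or the data extension procedure described above: it handles a single noise-free resonance on a 64-point grid, and the function name, the `keep` fraction, the iteration count, and the random schedule are all assumptions.

```python
import cmath
import math
import random


def ist_reconstruct(samples, grid_size, iterations=60, keep=0.9):
    """Iterative soft thresholding for non-uniformly sampled data.

    `samples` maps measured time indices to complex values. Returns the
    estimated complex amplitude of each frequency bin. Measured points are
    never altered; only their unexplained residual is re-analyzed.
    """
    n, m = grid_size, len(samples)
    spectrum = [0j] * n
    residual = dict(samples)
    for _ in range(iterations):
        # Spectrum of the residual, normalized by the number of measured points.
        res_spec = [
            sum(v * cmath.exp(-2j * math.pi * k * t / n) for t, v in residual.items()) / m
            for k in range(n)
        ]
        threshold = keep * max(abs(c) for c in res_spec)
        # Soft threshold: move only the part of each bin above the threshold.
        for k, c in enumerate(res_spec):
            magnitude = abs(c)
            if magnitude > threshold:
                spectrum[k] += c * (magnitude - threshold) / magnitude
        # Residual = measured data minus the current model, at measured times only.
        residual = {
            t: samples[t]
            - sum(a * cmath.exp(2j * math.pi * k * t / n) for k, a in enumerate(spectrum))
            for t in samples
        }
    return spectrum


N, K = 64, 5                       # grid size and true frequency bin (arbitrary)
rng = random.Random(1)
measured_times = sorted(rng.sample(range(N), 32))  # half of the grid is measured
data = {t: cmath.exp(2j * math.pi * K * t / N) for t in measured_times}

reconstructed = ist_reconstruct(data, N)
peak_bin = max(range(N), key=lambda k: abs(reconstructed[k]))
print(peak_bin, round(abs(reconstructed[peak_bin]), 3))
```

Each pass transfers only the strongest part of the residual spectrum into the model, so sampling artifacts, which stay below the threshold, are not carried into the reconstructed spectrum.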

Optional Enhancements to Reduce Background

In general, when using the methods described herein there is no need to purify or isolate the substrate metabolites that are present in the sample from the other molecules in the sample (e.g., by chromatographic column; see, e.g., Sidelmann et al., Purification and 1H NMR Spectroscopic Characterization of Phase II Metabolites of Tolfenamic Acid, Drug Metab Dispos, Jun. 1, 1997, 25:725-731; or by other means well known to those of skill in the art). Thus, in typical embodiments, the only purification performed in the method prior to NMR is the separation of the metabolites into a water-soluble sample or a non-water-soluble sample.

In some embodiments, the concentration of the unlabeled substrate within the cell can be reduced for a period of time (e.g., 10 minutes to 4 hours) before adding the labeled substrate. This technique can help reduce background signal. The appropriate time frame can be determined by testing a range of conditions and monitoring background as compared to control cells not loaded with labeled substrate.

Automated NMR Analysis

NMR metabolite analysis is tedious and complex. To circumvent these problems, a custom “NMR Metabolite Arrays” program was created to automate the process. As shown in FIG. 4, spectra are first automatically phased, aligned, and normalized with a spike-in control. For example, a known material, such as 4,4-dimethyl-4-silapentane-1-sulfonic acid (DSS), tetramethylsilane (TMS), trimethylsilyl propionate (TSP), 4,4-dimethyl-4-silapentane-1-ammonium trifluoroacetate (DSA) or other NMR standard reference compound can be added into each sample, and its concentration used as a reference to enable relative quantification comparisons between samples. Next, using a custom automatic peak picking program, peak lists can be generated for each sample, where each resonance peak is converted into an X, Y coordinate with an intensity value. A “MASTER PEAK LIST” program that generates a “master” look-up table for all the resonances in the spectra under investigation can then be run. This program reads all X, Y points from the individual peak list files, removes duplicates within defined tolerances, and writes the resulting set of peaks to a standard output such that all possible metabolite resonances under investigation are determined. Depending on the analysis, it is possible to input the entire HSQC data from the Human Metabolome Database into the master peak list. However, taking this approach requires longer computation times, and in most cases is unnecessary. Thus, creating master look-up tables for the spectra that are specifically being investigated is preferred.
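The de-duplication performed by a master peak list step can be sketched as below. This is not the “MASTER PEAK LIST” program itself; the tuple layout (¹H ppm as X, ¹³C ppm as Y, then intensity), the tolerance values, and the example peaks are hypothetical.

```python
def build_master_peak_list(peak_lists, tol_x=0.02, tol_y=0.2):
    """Merge per-sample peak lists into one list of unique (X, Y) positions.

    Each peak is a tuple (x_ppm, y_ppm, intensity). Two peaks are treated as
    the same resonance when both coordinates agree within the tolerances
    (tolerance values here are assumptions for illustration).
    """
    master = []
    for peaks in peak_lists:
        for x, y, _intensity in peaks:
            is_duplicate = any(
                abs(x - mx) <= tol_x and abs(y - my) <= tol_y for mx, my in master
            )
            if not is_duplicate:
                master.append((x, y))
    return sorted(master)


# Hypothetical peak lists for two samples.
sample_a = [(3.40, 72.5, 1200.0), (1.33, 22.8, 800.0)]
sample_b = [(3.41, 72.6, 900.0), (2.05, 24.9, 400.0)]

master = build_master_peak_list([sample_a, sample_b])
print(master)
```

The peak near (3.4, 72.5) appears in both samples but is entered once, so the master list holds three resonances.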

As further shown in FIG. 4, after creating the master look-up table, “NMR arrays” are next generated for each sample. NMR arrays consist of a list of all possible metabolites and intensity values for each resonance under investigation. They are created by combining the individual test peak list and master peak list to fill in the intensity for resonances for all possible metabolites. If a metabolite is expressed in a test sample, the program will select that intensity value. If it is not present, the intensity is set at zero or an arbitrary number. The NMR arrays can then be analyzed via traditional statistical analysis programs to identify the differentially expressed resonances between spectra. The resonance frequencies can then be uploaded directly into a database, such as the Human Metabolome Database, to identify which metabolites are differentially expressed. Candidate metabolites can then be confirmed via additional NMR or M/S experiments.
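Filling an NMR array from a master look-up table and one sample's peak list can be sketched as follows. The data layout, tolerance values, the choice of taking the largest matching intensity, and the example numbers are assumptions for illustration and are not taken from the custom program.

```python
def build_nmr_array(master, sample_peaks, tol_x=0.02, tol_y=0.2, missing=0.0):
    """Intensity of each master resonance in one sample.

    `master` is a list of (x_ppm, y_ppm) positions and `sample_peaks` a list
    of (x_ppm, y_ppm, intensity). Resonances absent from the sample receive
    the `missing` value (zero by default).
    """
    array = []
    for mx, my in master:
        matches = [
            intensity
            for x, y, intensity in sample_peaks
            if abs(x - mx) <= tol_x and abs(y - my) <= tol_y
        ]
        array.append(max(matches) if matches else missing)
    return array


# Hypothetical master table and test sample.
master_table = [(1.33, 22.8), (2.05, 24.9), (3.40, 72.5)]
test_sample = [(3.41, 72.6, 900.0), (2.05, 24.9, 400.0)]

nmr_array = build_nmr_array(master_table, test_sample)
print(nmr_array)
```

Because every sample is projected onto the same master table, the resulting arrays have equal length and can be compared position by position.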

Differential Expression Analysis

Also provided herein are novel methods for identifying substrate metabolites differentially expressed by at least two populations of cells, e.g., a first population of cells and a second population of cells. One of the populations of cells can be from a healthy subject or cell line and used as a control. The methods can use the various features of the various method steps and techniques described herein and include: (a) loading a first and a second population of cells with a labeled substrate (e.g., a ¹³C, ¹⁵N, or ³¹P-labeled substrate); (b) culturing the first and the second population of cells of step (a) for a sufficient period of time to allow metabolic breakdown of the labeled substrate into substrate metabolites; (c) harvesting the substrate metabolites from the first and the second population of cells of step (b) to obtain a sample of substrate metabolites from each of the first and the second cell populations; (d) performing multidimensional NMR on the samples from each of the first and the second cell populations to determine the resonance spectra of the metabolized substrate, wherein the resonance spectra represent the metabolites of the substrate; (e) processing the resonance spectra using a custom “NMR arrays” program; and (f) comparing the resonance spectra of the first population of cells with the resonance spectra of the second population of cells to determine which resonance spectra are differentially expressed, wherein the differentially expressed resonance spectra represent differentially expressed metabolites.
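The comparison in step (f) can be pictured with a simple fold-change calculation on two NMR arrays. This is only one possible analysis and is not prescribed by the method: the pseudocount, the threshold, and the example intensities are hypothetical, and a real comparison would use replicate samples and a statistical test.

```python
import math


def log2_fold_changes(array_first, array_second, pseudocount=1.0):
    """log2 ratio of second to first intensity for each resonance.

    The pseudocount (an assumption) avoids division by zero for resonances
    that are absent from one population.
    """
    return [
        math.log2((second + pseudocount) / (first + pseudocount))
        for first, second in zip(array_first, array_second)
    ]


def differential_resonances(array_first, array_second, min_abs_log2fc=1.0):
    """Indices of resonances whose intensity changes at least two-fold."""
    changes = log2_fold_changes(array_first, array_second)
    return [i for i, change in enumerate(changes) if abs(change) >= min_abs_log2fc]


# Hypothetical NMR arrays for a control and a disease population.
control = [100.0, 50.0, 0.0]
disease = [105.0, 210.0, 40.0]

changed = differential_resonances(control, disease)
print(changed)
```

Here the second and third resonances would be flagged for follow-up, for example by looking up their chemical shifts in a reference database.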

In some embodiments, the methods for identifying differentially expressed substrate metabolites between at least two populations of cells can further include comparing the resonance spectra of step (f) with a database of known resonance spectra to determine the molecular structure(s) that the resonance spectra represent, and thereby determine which specific substrate metabolites are differentially expressed between the first and second, different populations of cells. Specific metabolites may be identified in this manner.

The methods described herein are useful for monitoring metabolism in any cell type, as well as in any tissue (e.g., the cell population can contain a heterogeneous population of cells), and are useful for monitoring metabolism in cells exhibiting any phenotype as compared to a cell not exhibiting the phenotype.

Diagnostic Applications

Identified differentially expressed metabolites are indicative of the different phenotype, and their expression can therefore be used to diagnose this phenotype, e.g., to diagnose increased metastatic potential, to diagnose insulin resistance, or to diagnose disease.

We discovered that N-acetylneuraminic acid (NANA) is more highly expressed in breast cancer tumor initiating cells. From a diagnostic point of view, the presence of excess NANA can serve as a biomarker for tumorigenic potential. Once such differential expression is observed, detection methods such as antibodies or mass spectrometry can be utilized to monitor NANA expression (e.g., intracellularly or extracellularly) to help identify aggressive tumors. In summary, using the methods described herein for breast cancer cells, the specific breakdown of glucose was followed and NANA was discovered to be widely upregulated in more malignant cells. Enzymes in NANA biosynthesis were identified as new therapeutic targets, and the expression level of the molecule was found to correlate with increased migratory potential.

The novel methods can also be applied to patient biofluids and tissue samples (e.g., blood, urine, and plasma) to discover metabolic differences that can serve as novel biomarkers for a particular disease.

Methods of Determining Candidate Therapeutic Agents

In various embodiments, the methods for identifying substrate metabolites differentially expressed by at least two populations of cells can further include the step of identifying a biosynthetic pathway involved in the generation of the substrate metabolites and identifying proteins/enzymes of the pathway that can be targeted to modulate the differential expression of the metabolite. In turn, these proteins and enzymes can serve as candidate targets for modulating the phenotype of the cells, e.g., a disease phenotype, metastatic potential, or resistance. Thus, the new methods provide a means for identifying therapeutic targets. Once the metabolite is identified and an NMR metabolite array has been created for a given sample, databases of biosynthetic pathways can be screened to identify the pathway of synthesis of the metabolite.

For example, the NMR metabolite array can be electronically linked to the Human Metabolome Database and/or ChemPub, to select possible therapeutic targets within the metabolite biosynthetic pathways and propose substrate-based inhibitors using the metabolite itself as a lead scaffold for drug design. A series of differentially expressed metabolites that can serve as biomarkers have been identified and described herein. Novel therapeutic targets within the biosynthetic pathways, as well as FDA-approved drugs that show efficacy in the laboratory and could be rapidly translated into new applications in the clinic, have also been identified.

The cell populations used in the various methods described herein can be from healthy or diseased subjects. The cells can be from isogenic populations. In some embodiments, the first population of cells and the second population of cells have different phenotypes (e.g., differ in metastatic potential, differ in response to insulin, or differ in expression of disease genes). Differential metabolites expressed in any phenotype can be assessed as compared to an isogenic cell that does not exhibit the phenotype. Phenotypes are easily identified by those skilled in the art, and include but are not limited to phenotypes associated with a particular disease or disorder.

In some embodiments, the first population of cells is a control population of cells and the second population of cells has been contacted with a test compound or agent (e.g., after treatment with the compound or agent). The disappearance of differentially expressed metabolites that are associated with a particular phenotype serves as an indicator that the test compound or agent is capable of inhibiting the phenotype, for example, inhibiting metastasis, or inhibiting the effects of the expression of a diseased gene. Alternatively, the appearance of a differentially expressed metabolite (e.g., one expressed only in normal cells as opposed to diseased cells) serves as an indicator that the compound or agent is useful for treatment of the disease. Thus, in various embodiments, the methods for identifying differentially expressed metabolites can be used to screen for compounds or agents that modulate a phenotype, which can be used for treatment of disease.

The metabolic consequences of overexpressing or inhibiting a gene of interest can be identified using the new methods described herein. In one embodiment, the method further comprises inhibiting or overexpressing a gene in one of the cell populations (e.g., the second population of cells). Similarly, the metabolic consequences of particular compounds or agents, e.g., to assess toxicity, also can be identified.

The test compounds or agents can be, for example, a small molecule, a nucleic acid RNA (e.g., siRNA or microRNA), a nucleic acid DNA, a protein, a peptide, or an antibody. The inhibitors can be selected from the group consisting of: a small molecule, a nucleic acid RNA (e.g., siRNA), a nucleic acid DNA, a protein, a peptide, and an antibody. In one embodiment, the inhibitor is an inhibitor of an enzyme (e.g., a neuraminidase inhibitor).

Methods of Treating Disorders with Therapeutic Agents

As described in the examples below, through the methods described herein, CMAS, NANS (also known as sialic acid synthase), and NANA cell-surface expression have been determined to be therapeutic targets whose inhibition decreases migration of cancer cells and prevents tumor initiation in vivo. Thus, in another aspect, the present disclosure includes new methods for treating disorders such as cancer (e.g., by inhibiting metastasis and/or blocking tumor initiation) in a subject by administering an effective amount of an inhibitor of the targets discovered using the new methods described herein. In particular, the methods include administering to a subject in need thereof a therapeutically effective amount of an inhibitor of CMAS or NANS, or a therapeutically effective amount of an inhibitor or agent that lowers NANA expression. For example, we have identified candidate CMAS inhibitors, including a molecule we designed and synthesized, termed F-NANA, and influenza drugs that are already FDA approved, including Relenza and Tamiflu. The efficacy of these drugs in in vivo mouse models is being tested, and because Relenza and Tamiflu have already been evaluated for safety in human subjects, accelerated FDA approval for an investigational new drug is possible.

Application of the New Methods to Personalized Medicine

As mentioned, the new methods described herein have been successfully shown to characterize the metabolic differences in several oncology models using both cell lines and primary tissue. The methods have the potential to profoundly affect the strategies for designing novel therapeutic interventions and could lay the foundation for a metabolite-based approach to “personalized medicine.”

According to the National Institutes of Health, “personalized medicine” is a practice of medicine that uses an individual's genetic profile to guide customized decisions made with regard to the prevention, diagnosis and treatment of disease in that individual. To date, most efforts rely on genomic information to identify DNA mutations, amplifications, or deletions. Rarely, however, is a disease the result of a single genetic lesion, and it is often not obvious how genetic variations will manifest themselves. Moreover, non-genetic changes, including epigenetic differences, can also have profound effects on gene expression and cellular properties. Further, many commonly mutated genes, such as those in cancer, do not have small molecule inhibitors and are often termed “undruggable targets.” Establishing the metabolic profile of key cells in an individual suffering from a disease, with the methodology described herein, could provide powerful information useful in diagnosing, treating and monitoring the disease state of an individual.

As noted, diseases are incredibly complex and heterogeneous, and the effect of a misregulated gene or genes is not always obvious, making it difficult to design the best therapeutic strategies for intervention. Metabolism, on the other hand, is the end product of the genome. Using the new platforms described herein, the metabolic differences in a specific patient can quickly highlight functioning or non-functioning pathways in that individual. Further, metabolic pathways have been extensively studied, and in many cases inhibitors for metabolic enzymes, such as antimetabolites (substances bearing a close structural resemblance to the natural metabolite), already exist and are already used in the clinic. Examples of such potential therapeutic discoveries based on the differential expression NMR analysis are also presented in detail below in the Examples. The aforementioned differential expression, for example of metabolites, can be a powerful tool for diagnosing, monitoring, and treating disease in a patient on an individual, customized basis. Examples of analysis using this protocol to deduce novel metabolism pathways and differential metabolite expression between wild-type and disease tissue are described herein.

EXAMPLES

The invention is further described in the following examples, which do not limit the scope of the invention described in the claims.

The following examples discuss the novel protocols and the subsequent novel analysis methods that are used to efficiently determine the differential expression of metabolites in normal and disease state tissue. The results of such differential expression are described herein and are shown to be utilized in the design and identification of potential small molecule therapeutics.

Example 1: Methodology for the Screening Platform Utilizing NMR

Preparation of Biological Sample:

FIG. 1 provides an overall description of the platform. Approximately 20 million cells were used for each sample (however, it is possible to use as few as 2 million). Before harvesting, ¹³C-labeled precursors (glucose and glutamine in these examples) were added directly to the media and allowed to incubate for a user-defined amount of time, in this case 4 hrs. After aspirating the media and washing two times with phosphate-buffered saline (PBS), the cells were again counted and collected. An equal number of cells with no label were also harvested to serve as a ¹³C background control. Cells were lysed by the addition of ice-cold methanol, and an aqueous extraction was performed by adding equal parts water and chloroform. After centrifugation, the water-soluble and organic metabolites were separately collected, dried, and stored until ready to be further analyzed. No additional purification was performed.

Acquisition of Data with NMR:

2D NMR spectroscopy was employed, primarily relying on heteronuclear single quantum correlation (HSQC) to identify metabolites. HSQC experiments provide one-bond correlation between a heteronucleus (¹³C in the following examples, although other isotopes are feasible) and a proton. Crosspeaks arise due to transfer through the relatively large one-bond heteronuclear coupling, making it possible to identify shifts of directly attached nuclei. The unique chemical environment of each carbon atom paired to a proton gives rise to characteristic chemical shifts specific to a given metabolite. Reference HSQC spectra of purified metabolites (commonly available) were used for comparison. For example, at present the Human Metabolome Database (HMDB) contains information on 40,260 metabolite entries, many with HSQC data.

To overcome the traditional drawbacks of 2D NMR metabolite profiling (large sample requirements and long acquisition times), several additional techniques were used to improve the resolution and reduce the time required for the analysis. As described above, in a first step, to decrease the amount of material needed, cells were supplemented with ¹³C-labeled precursors (glucose, glutamine, pyruvate and amino acids were used, but other substrates and other isotopes are also possible). Theoretically, this should decrease the cell number required by a factor of ~100. Partly due to varying ionic strengths, this does not always scale perfectly linearly, and if there are no constraints regarding the sample size (i.e., using cell lines), it is recommended here that about 2-20 million cells be used for each analysis.

Example 2: Rapid Ultra High Resolution NMR Data Acquisition

Folding the Spectra:

While the aforementioned sample preparation alleviated the physical demands on the amount of sample, the long acquisition time required to record high resolution 2D NMR spectra was still a concern. To combat this, a multi-pronged approach was taken: “folding” the spectral width, using random phase sampling (RPS), and implementing non-uniform sampling (NUS) techniques and data extension in the analysis.

As described above, the spectral width (sw) is the range of frequencies over which NMR signals are to be detected. Metabolite mixtures contain diverse molecules, and the spectral width necessary to cover all potential carbon chemical shifts spans approximately 220 ppm. In FIG. 2A, the HSQC spectrum for all water soluble metabolites in KRAS mutant pancreatic cancer cells is shown. In this instance the ¹³C sw spans 220 ppm and a total of 1024 points were collected. By solving the equation resolution=SW/TD (where TD is the total data points), we observed the resolution is limited to ~107 Hz/point. However, on closer examination of FIG. 2A, the majority of metabolites run along a diagonal and most of the spectrum is empty. Collecting points along the entire sw greatly diminishes resolution, and as acquisition time is the reciprocal of resolution, experimental time is also wasted. As such, we gradually decreased the sw to increase the resolution. FIG. 2B shows an HSQC of the same sample with 140 ppm sw and, as a result, resolution of ~68 Hz/point; FIG. 2C, with sw of 110 ppm, has ~54 Hz/point resolution; and FIG. 2D, with 90 ppm sw, generates ultra-high resolution of ~44 Hz/point. In each of the folded spectra the aliased peaks are easily identifiable and the true chemical shifts can be back calculated using the following equation (δobs=δ+sw). Of note, the maximum resolution due to C-C scalar coupling is ~35 Hz. Using our folding strategy we are able to obtain ultra-high resolution spectra.
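The way digital resolution tracks the spectral width, and the back-calculation of an aliased shift, can be illustrated with a short Python sketch. The function names, the example peak position, and the `max_folds` parameter are hypothetical. The reference value of ~107 Hz/point at 220 ppm is taken from the text, and the unfolding rule assumes that the stated relation δobs=δ+sw extends to any whole number of spectral widths.

```python
def scaled_resolution(ref_hz_per_point, ref_sw_ppm, new_sw_ppm):
    """Digital resolution after narrowing sw with TD held fixed.

    Follows resolution = SW / TD: with the same number of points,
    Hz/point scales in proportion to the spectral width.
    """
    return ref_hz_per_point * new_sw_ppm / ref_sw_ppm


def unfold_candidates(delta_obs_ppm, sw_ppm, max_folds=2):
    """Candidate true shifts for a peak seen at delta_obs in a folded spectrum.

    Assumption: simple aliasing, delta_obs = delta + n * sw for integer n,
    so delta = delta_obs - n * sw. The chemically sensible candidate is then
    chosen by inspection (for example, against a reference spectrum).
    """
    return [delta_obs_ppm - n * sw_ppm for n in range(-max_folds, max_folds + 1)]


if __name__ == "__main__":
    for sw in (220.0, 140.0, 110.0, 90.0):
        print(f"{sw:.0f} ppm -> {scaled_resolution(107.0, 220.0, sw):.1f} Hz/point")
    # Hypothetical aliased peak observed at 30 ppm in a 90 ppm window
    print(unfold_candidates(30.0, 90.0, max_folds=1))
```

Scaling from the 220 ppm reference gives values close to the ~68, ~54 and ~44 Hz/point quoted for the 140, 110 and 90 ppm spectra.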

Of note, to provide optimal flip angles uniformly across the large carbon spectral width, broad band adiabatic shaped pulses were utilized for all 180 degree pulses along the carbon channel. This is especially important for enabling efficient coherence transfer among scalar coupled spins.

Non Uniform Sampling and Extension of Data:

The measured “free-induction decay” (FID) of an NMR sample is created by the oscillating current generated by the precession of all magnetized bonds. This signal decays due to nuclei in other molecules creating spin-spin decoherence. The rate at which this occurs is known as the transverse relaxation rate (T2). For any NMR experiment it is widely viewed that to obtain maximum resolution one should collect points in the indirect dimension close to 1.2*T2. However, metabolites move rapidly, with molecular motion correlation times averaging 10⁻¹² to 10⁻¹¹ sec. Due to this rapid movement, for many metabolites there is little spin-spin decoherence and the T2 rates are almost infinitely long. Thus collecting ultra-high resolution metabolite data is theoretically possible, but in practice it would require extremely long measurement times, and in most experiments only a subset of data is collected, sacrificing resolution for speed.

By employing non-uniform sampling (NUS) techniques that are outlined in FIG. 2E, we were able to not only greatly increase the resolution of our data but also increase the speed at which we recorded high resolution spectra. For example, using the same sample, we could perform a uniformly sampled experiment with 128 indirect points in 5 hours with X resolution. Using NUS, we could instead collect 10% of 1024 points in equivalent time to generate a spectrum with 8× resolution. Alternatively, we could sample 10% of 512 points, increasing the resolution by 4× and decreasing the acquisition time by ~40%.

The Poisson-gap distribution was selected for the sampling schedule, followed by forward maximum (FM) entropy reconstruction. Metabolite mixtures contain molecules at various concentrations, and this has been shown to be the most effective method in detecting weak peaks. In addition, to further enhance our resolution we created a “data-extension” add-on, in which before reconstruction the total number of points in the indirect dimension is artificially doubled. The first half of the NUS data set is reconstructed using the sparsely sampled data and filling in the missing points according to FM reconstruction. The second half of the data is completely built using iterative soft thresholding. As shown in FIG. 2F, this increased our resolution by 2-fold without affecting acquisition time.
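For orientation, a Poisson-gap sampling schedule can be generated along the following lines. This is a minimal sketch of the commonly described sine-weighted Poisson-gap idea and is not the schedule generator used for the experiments above. The function names, the seed, the 2% adjustment step, and the retry limit are all assumptions made for illustration.

```python
import math
import random


def _poisson(rng, lam):
    """Draw one Poisson-distributed integer (Knuth's multiplication method)."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        k += 1
        p *= rng.random()
        if p <= limit:
            return k - 1


def poisson_gap_schedule(n_sampled, n_total, seed=0, max_tries=100000):
    """Pick n_sampled increment indices out of n_total with Poisson-distributed gaps.

    Gaps are weighted by a quarter sine wave, so early increments (where the
    signal is strongest) are sampled densely and later ones sparsely. The gap
    scale is nudged up or down until the schedule has exactly n_sampled points.
    """
    rng = random.Random(seed)
    scale = 2.0 * (n_total / n_sampled - 1.0)
    for _ in range(max_tries):
        points = []
        i = 0
        while i < n_total:
            points.append(i)
            i += 1
            weight = math.sin((i + 0.5) / (n_total + 1) * math.pi / 2)
            i += _poisson(rng, scale * weight)
        if len(points) == n_sampled:
            return points
        # Too many points: widen the gaps; too few: narrow them.
        scale *= 1.02 if len(points) > n_sampled else 1 / 1.02
    raise RuntimeError("no schedule with the requested number of points found")


if __name__ == "__main__":
    schedule = poisson_gap_schedule(102, 1024, seed=1)  # roughly 10% of 1024
    print(len(schedule), schedule[:10])
```

The resulting index list would be passed to the spectrometer as the set of indirect-dimension increments to acquire; the skipped increments are what the reconstruction step fills in.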

Analysis of Metabolites in the Water and the Organic Layer:

While it is not necessary to follow each step, this method allows for a full metabolic profile of both water soluble and organic metabolites. FIG. 3 shows the HSQC of water based metabolites (FIG. 3A) and organic based metabolites (FIG. 3B) from the same million p53 deficient lung cancer cells, and each experiment required only 1 hour of acquisition time. Equivalent spectra using ¹³C natural abundance and standard NMR techniques for the same sample amount would require several days. Both spectra are extremely well resolved, making it easy to identify metabolite resonance peaks. Importantly, many M/S methods have struggled to accurately detect lipids. Using our method, as shown in FIG. 3B, the resonances from organic molecules are readily identifiable.

Example 3: Analyzing Metabolite NMR Data

NMR Analysis: As summarized in FIG. 4, a custom “NMR Metabolite Array” program was created to automate the NMR analysis process using the method herein described. First, the spectra are phased, aligned, and scaled with an internal control. 1 mM 4,4-dimethyl-4-silapentane-1-sulfonic acid (DSS) was added into each sample, and its concentration was used as a reference to allow relative quantification comparisons between samples to be made. We created an automatic peak picking program to generate peak lists for each sample, where each resonance peak was converted into an X, Y coordinate with an intensity value. Next, we created and ran a “MASTER PEAK LIST” program that generates a “master” look up table for all the resonances in the spectra under investigation.

In short, this program reads all X, Y points from the individual peak list files, removes duplicates within defined tolerances and writes the resulting set of peaks to a standard output. Depending on the analysis, it is possible to input the entire HSQC data from the Human Metabolome Database into the master peak list. Taking this approach requires longer computation times, and in most cases is unnecessary. Creating master look up tables for the spectra that are specifically being investigated is preferred.
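The duplicate-removal step can be sketched in a few lines of Python. This is an illustrative sketch only and does not reproduce the “MASTER PEAK LIST” program itself. The function name, the in-memory peak format (¹³C ppm, ¹H ppm, intensity), the example peaks, and the tolerance values are all assumptions.

```python
def build_master_peak_list(peak_lists, tol_c=0.2, tol_h=0.02):
    """Merge per-sample peak lists into one master list of (13C, 1H) positions.

    A peak is treated as a duplicate when it lies within tol_c ppm (13C) and
    tol_h ppm (1H) of a position already in the master list. Both tolerances
    are hypothetical defaults.
    """
    master = []
    for peaks in peak_lists:
        for c, h, _intensity in peaks:
            is_duplicate = any(
                abs(c - mc) <= tol_c and abs(h - mh) <= tol_h
                for mc, mh in master
            )
            if not is_duplicate:
                master.append((c, h))
    return master


if __name__ == "__main__":
    # Hypothetical peak lists: (13C ppm, 1H ppm, intensity)
    sample_a = [(62.99, 3.84, 10.0), (42.03, 2.19, 5.0)]
    sample_b = [(63.05, 3.85, 7.0), (106.50, 6.15, 3.0)]
    for resonance_id, (c, h) in enumerate(build_master_peak_list([sample_a, sample_b])):
        print(resonance_id, c, h)
```

Each position's index in the returned list can serve as its resonance ID in later steps.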

Next, NMR arrays are generated for each sample, in which the individual peak list and master peak list are combined to fill in the intensity for resonances for all possible metabolites. If a metabolite is expressed in a test sample, the program will select that intensity value. If it is not present, the intensity is set at zero or an arbitrary number. The NMR arrays can now be analyzed via traditional statistical analysis programs to identify the differentially expressed resonances between spectra. The resonance frequencies can then be uploaded directly into the Human Metabolome Database to identify which metabolites are differentially expressed. Candidate metabolites can then be confirmed via additional NMR or M/S experiments.
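As an illustration of how an NMR array might be filled, the sketch below looks up each master-list resonance in one sample's peak list and records its intensity, or a placeholder when the resonance is absent. The function name, tolerances, example data, and the choice to keep the strongest match are assumptions, not details taken from the program described above.

```python
def build_nmr_array(master, peaks, tol_c=0.2, tol_h=0.02, missing=0.0):
    """Return one intensity per master resonance for a single sample.

    master  : list of (13C ppm, 1H ppm) positions; the list index is the resonance ID
    peaks   : list of (13C ppm, 1H ppm, intensity) for the sample
    missing : value used when the sample has no peak at that position
    """
    array = []
    for mc, mh in master:
        matches = [
            intensity
            for c, h, intensity in peaks
            if abs(c - mc) <= tol_c and abs(h - mh) <= tol_h
        ]
        # Assumption: if several peaks fall inside the tolerance box, keep the strongest.
        array.append(max(matches) if matches else missing)
    return array


if __name__ == "__main__":
    master = [(62.99, 3.84), (42.03, 2.19), (106.50, 6.15)]  # hypothetical
    sample = [(63.05, 3.85, 7.0), (106.48, 6.15, 3.0)]       # hypothetical
    print(build_nmr_array(master, sample))
```

Because every sample is projected onto the same master list, the resulting arrays have equal length and can be compared column by column in standard statistical software.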

Example 4: Analysis of Differentially Expressed Metabolites: A Novel Approach

Background Correction:

To monitor the flux of a given precursor, a separate spectrum with an equal number of cells with no labeled precursor was recorded. The spectrum from the unlabeled cells represents the ¹³C background within the cell and can be subtracted from the test spectra to specifically follow the metabolic breakdown of the ¹³C labeled substrate. Glucose and glutamine are two of the main energy sources within a cell, and the metabolic breakdown of each precursor is well characterized. To examine the flux of glucose and glutamine into specific pathways, ¹³C-¹H HSQC spectra of equal numbers of breast tumor initiating cells with no labeled precursor and with either ¹³C-glutamine or ¹³C-glucose added as substrate were recorded. As shown in FIG. 5A, with this limited number of cells very few signals arise from the ¹³C background, and the glutamine (5B) and glucose (5C) spectra are quite distinct. By creating NMR arrays we were able to convert the complex NMR data into standard text files. The intensity values for resonances in the unlabeled sample were subtracted from the matching signals in the glutamine or glucose arrays. As shown in FIG. 6, by plotting the intensity value and resonance metabolite ID from the NMR array it is possible to identify changes specific to glucose or glutamine flux through a given sample. The X-axis lists all the resonance metabolite IDs for every metabolite identified from the HSQC spectra of FIG. 6b and FIG. 6c. The Y-axis highlights how the intensity of each resonance changes in each condition. As expected, it is clear that glutamine and glucose cause flux into different metabolic pathways. The resonance metabolite IDs correspond to specific ¹³C-¹H chemical shifts, which can be uploaded into the Human Metabolome Database to identify the differential metabolites.
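Once the labeled and unlabeled samples are expressed as arrays over the same resonance IDs, the background correction reduces to an element-wise subtraction. The sketch below is illustrative only. The function name and example values are hypothetical, and clipping negative differences to a floor of zero is an assumption rather than a step stated in the text.

```python
def subtract_background(labeled, unlabeled, floor=0.0):
    """Subtract the unlabeled (13C background) array from a labeled array.

    Both arrays must be indexed by the same resonance IDs. Differences below
    `floor` are clipped (assumption), so a resonance no stronger than the
    background reads as absent.
    """
    if len(labeled) != len(unlabeled):
        raise ValueError("arrays must cover the same resonance IDs")
    return [max(sig - bg, floor) for sig, bg in zip(labeled, unlabeled)]


if __name__ == "__main__":
    # Hypothetical intensities for three resonance IDs
    glucose_array = [12.0, 0.5, 4.0]
    background_array = [2.0, 1.0, 0.0]
    print(subtract_background(glucose_array, background_array))
```

The corrected glucose and glutamine arrays can then be plotted against resonance ID to compare flux between conditions.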

Example 5: Identifying Differentially Expressed Metabolites in Triple Negative Breast Cancer Tumor Initiating Cells

Protocol:

Originating from the same normal breast tissue, BPLER and HMLER cells were transformed with identical genetic factors but were propagated in different culture media. BPLER cells are highly tumorigenic and have an increased metastatic potential over that of HMLER cells. Fewer than 50 BPLER cells injected into the mammary fat pad of a mouse result in the development of a tumor, while more than 10⁶ HMLER cells are required to form a tumor in vivo (Table 2, below). BPLER cells are a model cell line for triple negative breast cancer tumor initiating cells, and BPLER tumors histologically resemble those of triple negative breast cancer patients. According to the protocol, about 20 million BPLER and HMLER cells were cultured in the presence of uniformly labeled ¹³C-glucose, and subsequently harvested and lysed. The aqueous layer was then collected, dried, and re-dissolved in ultra-pure D2O, ready for NMR analysis. The organic layer was stored for future examination.

TABLE 2

             Tumors Formed
Cells        BPLER    HMLER    MCF7
5 × 10⁴      4/4      0/4      0/4
5 × 10³      4/4      0/4      0/4
5 × 10²      4/4      0/4      0/4
5 × 10       4/4      0/4      0/4

Using the new platform methodology described herein, the rapid, unbiased, ultra-high resolution NMR metabolite screening was performed. Examples of resulting ¹³C-¹H HSQC for BPLER and HMLER cells are shown in FIGS. 7A and 7B, respectively.

Results:

Using our custom NMR analysis program, the resonances in each spectrum were converted into NMR arrays. FIG. 7C summarizes the information. By combining all replicates from both cell lines, approximately 2100 resonances were identified. The metabolite ID of each resonance is listed on the X-axis. The relative intensity of each resonance is plotted on the Y-axis. From this analysis we observed a high degree of similarity between the metabolite resonances of each cell line (>75% of the resonances were present in both cell lines). However, several resonance peaks that were common to both cell lines had varied expression, while some peaks were unique to HMLER cells and others specific to BPLER. To confirm the arrays accurately reflected the HSQC data, as shown in FIG. 7D, the region of the HSQC spectra that was predicted to have resonances specific to BPLER cells in the NMR array was expanded, and indeed resonances were only found in the BPLER spectra.

Using the NMR arrays we were able to quickly identify resonances that were specifically enriched in BPLER tumor initiating cells. Table 3 highlights the top resonances most enriched in BPLER tumor initiating cells. Shown are the metabolite IDs from the array, as well as the corresponding ¹³C-¹H data.

TABLE 3

Metabolite Resonance ID    ¹³C        ¹H
589                        62.993     3.842
1951                       65.98      3.823
283                        68.633     4.248
1381                       69.973     4.002
402                        42.028     2.193
1602                       72.951     3.739
245                        72.849     3.953
1333                       42.015     1.814
119                        106.501    6.151

These resonances were input into the Human Metabolome Database and 6 of the 9 resonances, highlighted in yellow, were predicted to be from N-acetylneuraminic acid (NANA), strongly suggesting NANA is the metabolite corresponding to the differentially expressed resonances identified in the NMR arrays.

Several additional steps were taken to confirm NANA is indeed upregulated in BPLER tumor initiating cells. First, the ¹³C-¹H HSQC of pure NANA shown in FIG. 8A contains cross peaks at approximately the same location as those found over-represented in the BPLER spectra. Second, we designed custom NMR pulse programs to specifically examine NANA. In NANA biosynthesis the C2 of glucose and a nitrogen atom of glutamine are joined to form a carbon-nitrogen bond. As such, BPLER cells were incubated with ¹³C-C2 glucose and ¹⁵N-glutamine, and an HCN experiment was recorded to detect metabolite resonances that contain a hydrogen connected to a carbon that is also connected to a nitrogen atom (HCN). In this experiment, as shown in FIG. 8B, BPLER cells contain a differentially expressed resonance at the same position as would be expected in the NANA standard. Lastly, mass spectrometry experiments shown in FIG. 8C, performed by multiple reaction monitoring LC/MS using electrospray in the negative mode, confirmed NANA is approximately 7-fold higher in BPLER tumor initiating cells. The reported values are the area under the curve for NANA expression in each cell line.

Using Results of Differentially Expressed Metabolites to Develop Diagnostics:

By following glucose flux within BPLER cells (i.e., subtracting background ¹³C and tracing specific breakdown of glucose), the tumor initiating cells were observed to divert part of their glucose metabolism to NANA production. NANA is a 9-carbon sugar that is often incorporated onto cell-surface glycoproteins. Previous reports identified that wheat-germ agglutinin (WGA) has a strong affinity for NANA-modified proteins. Using rhodamine labeled WGA, we performed immunofluorescence microscopy, shown in FIG. 9, and observed that BPLER cells have increased NANA expression on their cell surface. Thus, NANA itself represents a novel diagnostic to specifically identify breast tumor initiating cells, and WGA and similar molecules that specifically recognize NANA and NANA modified molecules could provide new tools to detect and isolate tumor initiating cells or those with increased malignant potential.

Example 6: Identification of a Target Utilizing the Differential NMR Data

NANA is a sugar that is often incorporated onto cell surface proteins. As shown in FIG. 10, NANA is derived from glucose by about 15 distinct enzymes, including key enzymes NANS and CMAS. Using CellTiter-Glo, a well-known assay for cell viability, it was observed that knockdown of NANS or CMAS had little to no effect on cell viability or proliferation of the cells (FIG. 11), whereas knockdown of the PLK1 enzyme led to total cell death.

However, NANA is incorporated on the cell surface of several proteins involved in cell adhesion, and loss of NANA was suspected to affect cell motility. In a cell migration assay, cells were cultured in a dual chamber containing small pores at the bottom of the top chamber; malignant cells (especially those with metastatic potential) are able to migrate through the pores and form colonies. As expected, as shown in FIG. 12, BPLER cells have an increased migration rate as compared to HMLER cells due to their increased tumorigenic properties. However, the knockdown of NANS or CMAS completely abolished the cells' ability to migrate to the lower chamber, while cells transfected with control siRNAs maintained normal migration. To confirm NANA expression directly influences motility, a rescue experiment was performed in which cells transfected with siRNAs against NANS or CMAS were supplemented with NANA. In the presence of NANA they were able to partially restore the migration phenotype. The results of several migration studies are quantified in FIG. 12B. These results suggest NANA expression is crucial for cell migration and could be important for metastasis. Monitoring NANA expression could predict the metastatic potential of a cell. In addition, both NANS and CMAS are novel key targets to manipulate migration and metastasis.

Example 7: CMAS Increases Cell Migration

As mentioned, the knockdown of NANS and CMAS, key enzymes used to generate and attach NANA to proteins, had no effect on cell proliferation but greatly reduced the ability of BPLER cells to migrate (shown in FIG. 12). While the mRNA level of CMAS and NANS was equivalent in HMLER and BPLER cells, we observed that CMAS protein is dramatically overexpressed in BPLER tumor-initiating cells (FIG. 13). To determine if CMAS expression contributes to the malignant phenotype of tumor initiating BPLER cells, non-aggressive HMLER cells were transfected with a plasmid to force the expression of CMAS. Using the previously described cell migration assay, as shown in FIG. 14A, HMLER cells with enhanced CMAS expression increased their migration potential by several fold compared to the control. The reciprocal experiment was also performed, and stable CMAS knockdown BPLER cells (BPLER-shCMAS1) were created. As expected, as shown in FIG. 14B, these cells were not able to form colonies in the same migration assay. These results suggest CMAS protein expression is pivotal in cell migration and/or metastasis. To date there have been no references to the role of either NANS or CMAS in cancer. This could be in part due to most techniques relying on sequencing and/or microarray experiments that only examine mRNA levels. This highlights the strength of our method. By probing metabolites, NANA was identified as being expressed significantly higher in breast tumor-initiating BPLER cells. By subsequently probing the enzymes in its biosynthetic pathway, a novel, potent target that could be important for tumorigenicity was discovered.

Example 8: CMAS Impacts Tumor Formation Initiation

To determine how loss of CMAS/NANA expression affects tumor initiation in vivo, we performed the experiment outlined in FIG. 15A. As described above, stable CMAS knockdown BPLER cells (BPLER-shCMAS1) were created that do not express CMAS protein. In our experiment, 500,000 BPLER cells overexpressing an empty vector (control) and 500,000 BPLER-shCMAS1 cells (which do not express CMAS) were injected into the mammary fat pad of NOD/SCID mice. The goal was to analyze differences in tumor size and the number of metastases between each group. Every three days for 45 days the mice were examined for palpable tumors, and if detected, the tumor height, length and width were directly measured to calculate tumor volume. As shown in FIG. 15B, within 45 days, 4/4 of the control mice had developed large primary tumors. Amazingly, none (0/5) of the BPLER-shCMAS1 mice had any palpable tumors. Indeed, after 90 days the mice injected with BPLER-shCMAS1 cells remained tumor free. Taken together, the in vitro and in vivo data suggest that CMAS is a completely novel and bona fide therapeutic target for cancer.

Example 9: Using the NMR Data to Identify and Design Therapeutic Agents for Breast Cancer Therapy

Enzymes such as CMAS are ideal candidates for small molecule drug inhibition. The enzyme mechanism of CMAS is shown in FIG. 16A, where CMAS activates the hydroxy group of NANA in a divalent cation dependent manner, so that it can subsequently attack the alpha-phosphate of an incoming cytidine triphosphate (CTP) molecule to form a cytidine monophosphate-NANA (CMP-NANA) intermediate. Using the NANA scaffold, a substrate-based analog replacing the hydroxyl group with a fluorine was designed and synthesized (FIG. 16B). This substitution should theoretically maintain the ability of NANA to bind CMAS while preventing the enzymatic reaction. In the presence of the CMAS inhibitor, using the previously described cell migration assay, it was shown that BPLER cells are no longer able to migrate (FIG. 16C). Using the NANA scaffold, we were able to rapidly design and synthesize a substrate based inhibitor that has an effect in cell lines.

The F-NANA derivative synthesized had a slight chemical likeness to the FDA approved drugs Relenza and Tamiflu (FIG. 17). Relenza and Tamiflu are both designed to inhibit the influenza enzyme neuraminidase. Neuraminidase specifically cleaves NANA molecules on the cell surface to facilitate viral entry into the cell. Neuraminidase and CMAS share NANA as a substrate, and hence these known influenza therapeutics were suspected to be inhibitors of CMAS. As shown in FIG. 17b, Relenza treatment of BPLER cells blocked cell migration. Both Relenza and Tamiflu are already FDA approved, marketed therapeutics and, pending positive results in mouse models, could see rapid entry to clinical trials for cancer indications.

Neuraminidase itself is known to remove NANA from the cell surface. We suspected neuraminidase could be used to remove NANA from the surface of malignant cells and, like siRNAs against CMAS, exert a similar effect on migration and tumor initiation. Pre-incubation of BPLER cells with active neuraminidase enzyme diminished NANA expression as determined by rhodamine labeled wheat germ agglutinin (WGA) microscopy (FIG. 18A). In addition, BPLER cells treated with neuraminidase were no longer able to migrate in the migration assay (FIG. 18B). Neuraminidase and flu-like molecules, including empty virions, may represent an innovative way to both detect tumor-initiating cells (influenza virions have a high affinity for NANA) and inhibit tumor initiation and metastasis by removing NANA from tumor populations.

OTHER EMBODIMENTS

It is to be understood that while the inventions have been described in conjunction with the detailed description thereof, the foregoing description is intended to illustrate and not limit the scope of the inventions, which are defined by the scope of the appended claims. Other aspects, advantages, and modifications are within the scope of the following claims.

What is claimed is:
1. A method for monitoring metabolism of a substrate within a given type of cell in a sample, the method comprising: a. culturing a given type of cell of a first sample with a substrate for a sufficient period of time to allow metabolic breakdown of the substrate into substrate metabolites, wherein at least a portion of the substrate is optionally labeled with a nuclear magnetic resonance (NMR) stable isotope; b. harvesting the substrate metabolites from the cells of step (a) to obtain a second sample of substrate metabolites; and c. performing multi-dimensional NMR on the second sample of step (b) to determine a resonance spectrum of the metabolized substrate, wherein the resonance spectrum represents the metabolites of the substrate, and wherein the multi-dimensional NMR comprises any one of the following techniques: spectral width folding, random phase sampling, non-uniform sampling, and data extension for enhanced dynamic range data reconstruction.